Abstract

The explosive growth of information challenges people’s ability to find items that fit their own
interests. Recommender systems provide an efficient solution by automatically pushing possibly relevant
items to users according to their past preferences. Recommendation algorithms usually embody the
causality from what has been collected to what should be recommended. In this article, we argue
that in many cases a user’s interests are stable, and thus the previous and future preferences are highly
consistent. The temporal order of collections then does not necessarily imply a causal relationship.
We further propose a consistence-based algorithm that outperforms state-of-the-art recommendation
algorithms on disparate real data sets, including Netflix, MovieLens, Amazon and Rate Your Music.


Introduction 

With the rapid development of the Internet [1,2], the World Wide Web [3,4] and smart mobile phone
technologies [5,6], our social lives have been greatly changed. At the same time, we are facing an
inconceivably huge amount of information, such as trillions of web pages, billions of e-commerce products
and millions of videos, which severely challenges our ability to find the items we are interested in.
Taking search queries as keywords, search engines [7–9] ease this dilemma via powerful information
retrieval. However, they strongly tend to provide users with popular information and fail to match niche
items with personalized interests. In addition, they cannot dig out the things you like that are not easy
to describe with a few search queries. Given those limitations, recommender systems [10] show excellent
performance in providing personalized recommendations.

Due to the ever-decreasing costs of data storage and processing, recommender systems have gradually
spread to most areas of our lives. Vendors use our purchase records to recommend relevant products
and enhance sales [11], social web sites analyze social links to help us find new friends [12,13], and
online radio stations remember skipped songs to serve us better in the future [14]. In general, whenever
there are plenty of diverse products and customers are not alike, personalized recommendation may help
to deliver the right content to the right person. This is particularly the case for Internet-based
companies that try to exploit the so-called long tail of products, which are rarely purchased but,
owing to their multitude, can yield considerable profits [15]. For example, on Amazon.com, 20% to
40% of sales come from products that do not belong to the shop’s 100,000 most popular products [16].
A recommender system may hence have a significant impact on a company’s revenues: for example, as
mentioned by Sanders at the 3rd ACM Conference on Recommender Systems, 60% of DVDs rented by
Netflix are selected based on personalized recommendations. As discussed in [17], recommender systems
not only help decide which products should be offered to an individual customer, they also increase
cross-selling by suggesting additional products to customers and improve consumer loyalty, because
consumers tend to return to the sites that best serve their needs (see [18] for an empirical analysis of
the impacts of recommendations and consumer feedback on sales at Amazon.com).

Therefore, driven by their economic and social significance [19–21], studies on recommender systems
are flourishing, and the design of efficient recommendation algorithms attracts wide interest, from
engineering science to marketing practice and from mathematical analysis to the physics community
(see the review articles [22–24] and the references therein). Many recommendation techniques
have been developed, including collaborative filtering [25], content-based analysis [26–28],
knowledge-based analysis [29], time-aware analysis [30,31], tag-aware analysis [32,33], social
recommendation [34,35], constraint-based analysis [36], spectral analysis [37], iterative refinement [38],
principal component analysis [39], hybrid algorithms [40,41], diffusion-based algorithms [42–44], and
so on. This work is closely related to the diffusion-based methods, which have already found applications
in many real e-commerce systems, for example taobao.com and baifendian.com. Recently, the original
methods have been improved by considering the effects of initial resource distribution [45,46],
correlation-biased diffusion [4MI], users’ tastes [33], temporal effects [33], and so on.

Figure 1. Illustration of causal and consistent recommendation.

In general, a recommender system tries to find out users’ habits and recommends uncollected objects
to them based on their historical records. Most known recommendation algorithms embody a causality
relationship, that is, they recommend a certain object because of some already collected objects. In such
a situation, temporal order is a critical factor. Looking at a simplified example in figure 1(a), if the
target user has read the textbook Algorithm, we will prefer to recommend Data Mining instead of Data
Structure, since the latter should already have been studied before Algorithm. However, in many cases,
such as food, music and movies, such a relationship does not hold and the temporal order of a user’s
choices does not reflect any causality. As shown in figure 1(b), if the target user has watched the movie
Star Trek Into Darkness, we can infer that he/she likes science fiction movies and recommend both
The Day After Tomorrow and Cloud Atlas, regardless of which one is watched first.

As mentioned above, some selections result from causality with temporal order, while others may
only reflect consistent interests. We argue that a considerable part of selections can be explained by
consistence, while many known algorithms (e.g., the network-based inference [43]) embody the causality
hypothesis: from what has been collected to what should be recommended. In this article, based
on consistence, we propose a novel algorithm named consistence-based inference (CBI). We have tested
our algorithm on four real datasets: MovieLens, Netflix, Amazon and Rate Your Music (RYM). The
results demonstrate the higher accuracy, diversity and novelty of CBI compared with some baseline
algorithms: global ranking method (GRM), collaborative filtering (CF), network-based inference (NBI) and
heterogeneous network-based inference (HNBI). By integrating causality and consistence, we further
propose a so-called unbalanced CBI (UCBI) algorithm, which performs remarkably better than CBI.

Figure 2. An illustration of NBI. Subgraph (a): a bipartite network with objects denoted by rectangles
and users by circles, where a user is connected with an object if this user has collected this object.
Subgraph (b): the weights $w_{ij}$ (beside the arrow from node $j$ to node $i$) corresponding to the above
bipartite network after the projection from the user-object network to the object-object network, by Eq. (1).


Results 

A recommender system consists of users and objects, and each user has collected some objects.
Denoting the object set as $O = \{o_1, o_2, \cdots, o_n\}$, the user set as $U = \{u_1, u_2, \cdots, u_m\}$ and
the link set as $E$, the recommender system can be fully described by an $n \times m$ adjacency matrix
$A = \{a_{ij}\}$, where $a_{ij} = 1$ if $o_i$ is collected by $u_j$, and $a_{ij} = 0$ otherwise. Accordingly, we can
visualize the recommender system as a bipartite network $G$ with $m + n$ nodes, where the degrees of an
object $o_i$ and a user $u_l$, $k(o_i)$ and $k(u_l)$, respectively represent the number of users who have
collected $o_i$ and the number of objects collected by user $u_l$. Mathematically speaking, for a given user,
a recommendation algorithm generates a ranking of all the objects he/she has not collected yet and
recommends the top-$L$ uncollected objects to this user, with $L$ denoting the length of the
recommendation list. In this article, we fix $L = 50$, and we have checked that the major conclusions are
not sensitive to the value of $L$.




















Among many known algorithms, NBI is fast, robust and relatively accurate [23], and it is a good
choice as the benchmark algorithm since it embodies the causality hypothesis. According to the standard
NBI [43], given the target user $u_l$, the preference to select an object $o_i$ because of a prior selection of
another object $o_j$ is defined as

$$w_{ij} = \frac{1}{k(o_j)} \sum_{l=1}^{m} \frac{a_{il} a_{jl}}{k(u_l)}, \qquad (1)$$

which results from a simple random walk from node $o_j$ to node $o_i$. Denote by $f$ the initial collection
vector of user $u_l$, where $f_j = 1$ if user $u_l$ has collected object $o_j$ and $f_j = 0$ otherwise. The final
recommendation score of an arbitrary object $o_i$ is the simple sum of the contributions $w_{ij}$, where $j$
runs over all objects $o_j$ already collected (i.e., $f_j = 1$) by $u_l$, namely

$$f' = W f, \qquad (2)$$

where $W = \{w_{ij}\}_{n \times n}$ is the asymmetric weight matrix given by Eq. (1). A simple example
of how to calculate $W$ is shown in figure 2. The uncollected objects with the top-$L$ values in $f'$ will be
recommended to user $u_l$.
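
The two NBI steps (the weight matrix of Eq. (1) and the score vector of Eq. (2)) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the function names `nbi_weights` and `nbi_recommend` and the toy matrix `A_TOY` are assumptions introduced here, and the adjacency matrix is stored as a dense list of lists with objects as rows and users as columns.

```python
def nbi_weights(A):
    """Return W with W[i][j] = (1/k(o_j)) * sum_l A[i][l]*A[j][l]/k(u_l), as in Eq. (1).

    A is an n x m 0/1 list of lists: A[i][l] == 1 if object i is collected by user l.
    """
    n, m = len(A), len(A[0])
    k_obj = [sum(row) for row in A]                              # object degrees k(o_i)
    k_usr = [sum(A[i][l] for i in range(n)) for l in range(m)]   # user degrees k(u_l)
    W = [[0.0] * n for _ in range(n)]
    for j in range(n):
        if k_obj[j] == 0:
            continue  # an object nobody collected sends no resource
        for i in range(n):
            total = sum(A[i][l] * A[j][l] / k_usr[l] for l in range(m) if k_usr[l])
            W[i][j] = total / k_obj[j]
    return W


def nbi_recommend(A, user, L):
    """Rank the objects not yet collected by `user` by f' = W f (Eq. (2))."""
    n = len(A)
    W = nbi_weights(A)
    f = [A[j][user] for j in range(n)]                           # initial collection vector
    scores = [sum(W[i][j] * f[j] for j in range(n)) for i in range(n)]
    candidates = [i for i in range(n) if f[i] == 0]              # uncollected objects only
    candidates.sort(key=lambda i: scores[i], reverse=True)
    return candidates[:L]


# Hypothetical toy data: 3 objects (rows) x 3 users (columns).
A_TOY = [[1, 1, 0],
         [1, 0, 0],
         [0, 1, 1]]
```

With this definition every column of $W$ sums to one, which reflects the random-walk interpretation: the resource leaving $o_j$ is fully redistributed among the objects.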

Notice that Eq. (1) only accounts for the contribution from a prior selection $o_j$ to a possible candidate
$o_i$, that is to say, to which extent we would like to recommend $o_i$ because of the prior selection of $o_j$.
This is thus a typical causality-based recommendation algorithm. If, instead of the causality relationship,
we consider the consistence between $o_j$ and $o_i$, we need to account not only for the contribution from a
prior selection $o_j$ to a posterior selection $o_i$, but also for the contribution from $o_i$ to $o_j$, namely
whether the co-selection $(o_i, o_j)$ reflects a consistent interest of the target user $u_l$. Therefore,
corresponding to the causality relationship in Eq. (1), a consistence relationship could be

$$w^{CBI}_{ij} = w_{ij} \cdot \frac{w_{ji}}{\sum_{j'=1}^{n} w_{j'i}}, \qquad (3)$$

where the normalization factor $\sum_{j'=1}^{n} w_{j'i}$ is used to make sure the influences from prior
selections to posterior selections and from posterior selections to prior selections are comparable, namely

$$\sum_{j=1}^{n} \frac{w_{ji}}{\sum_{j'=1}^{n} w_{j'i}} = 1. \qquad (4)$$

Denoting the corresponding weight matrix as $W^{CBI} = \{w^{CBI}_{ij}\}_{n \times n}$, for the target user $u_l$
and his/her initial collection vector $f$, the recommendation score $f'$ can be obtained in a similar way as
Eq. (2):

$$f' = W^{CBI} f. \qquad (5)$$

Analogous to the standard NBI [43], CBI is also a parameter-free algorithm.
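
As a rough illustration of the consistence idea, the sketch below combines the forward weight $w_{ij}$ with the normalized backward weight $w_{ji}$. It assumes that Eq. (3) is a product of the two directions with the backward term divided by $\sum_{j'} w_{j'i}$; the extracted formula is damaged, so this reading is an assumption. The names `cbi_weights`, `score` and `W_TOY` are hypothetical, and `W_TOY` is simply a small NBI-style matrix written out by hand.

```python
def cbi_weights(W):
    """Consistence weights under the assumed form of Eq. (3):
    W_cbi[i][j] = W[i][j] * W[j][i] / sum_j' W[j'][i]."""
    n = len(W)
    # Assumed normalization factor for object i: sum over j' of W[j'][i].
    col_sum = [sum(W[jp][i] for jp in range(n)) for i in range(n)]
    W_cbi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if col_sum[i] == 0:
            continue  # no backward influence to normalize
        for j in range(n):
            W_cbi[i][j] = W[i][j] * W[j][i] / col_sum[i]
    return W_cbi


def score(W_any, f):
    """Recommendation scores f' = W f for a 0/1 collection vector f (Eq. (5))."""
    return [sum(w * fj for w, fj in zip(row, f)) for row in W_any]


# Hypothetical NBI weight matrix for three objects (each column sums to 1).
W_TOY = [[0.50, 0.50, 0.25],
         [0.25, 0.50, 0.00],
         [0.25, 0.00, 0.75]]
```

For a user who has collected only the third object, `score(cbi_weights(W_TOY), [0, 0, 1])` gives a nonzero score to the first object (both directions are nonzero) and zero to the second, whose link to the collected object is missing in both directions.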

It is quite possible that the strengths of the influences from prior selections to posterior selections and
from posterior selections to prior selections are different. Therefore we further introduce an unbalanced
consistence-based inference (UCBI), where Eq. (3) is modified as

$$w^{UCBI}_{ij} = (w_{ij})^{\alpha} \cdot \left( \frac{w_{ji}}{\sum_{j'=1}^{n} w_{j'i}} \right)^{\beta}, \qquad (6)$$

and accordingly

$$f' = W^{UCBI} f, \qquad (7)$$

where $W^{UCBI} = \{w^{UCBI}_{ij}\}_{n \times n}$. It is not surprising that the introduction of two tunable
parameters $\alpha$ and $\beta$ improves the algorithm’s accuracy compared with the standard CBI. In addition,
we would like to see: (i) how much the performance of CBI can be further improved, and (ii) the influence
from which direction is stronger.

Table 1. Algorithms’ performance on the four data sets. For each algorithm with parameters, the
performance indices are obtained by optimizing the corresponding parameters for the largest AUC
value, with resolution 0.01. Values in brackets are the standard deviations. The length of the
recommendation list is fixed at $L = 50$ and the number of samplings for the AUC value is fixed at
$n = 10^6$. See Supplementary Information (SI) for results with $L = 10$ and $L = 100$.

Movielens
        AUC              P                 Recall            I                 H                 <k>
GRM     0.8569(0.0023)   0.0508(0.0007)    0.3419(0.0008)    0.4085(0.0010)    0.3991(0.0007)    259(0.4410)
CF      0.8990(0.0020)   0.0638(0.0011)    0.4227(0.0009)    0.3758(0.0008)    0.5796(0.0016)    242(0.3724)
NBI     0.9093(0.0016)   0.0670(0.0011)    0.4431(0.0009)    0.3554(0.0008)    0.6185(0.0013)    234(0.3925)
HNBI    0.9145(0.0014)   0.0693(0.0011)    0.4584(0.0010)    0.3392(0.0009)    0.6886(0.0011)    219(0.4725)
CBI     0.9249(0.0011)   0.0705(0.0011)    0.4651(0.0009)    0.3348(0.0007)    0.6877(0.0005)    218(0.3034)
UCBI    0.9339(0.0013)   0.0816(0.0012)    0.5334(0.0007)    0.3067(0.0008)    0.8191(0.0001)    176(0.1270)

Netflix
        AUC              P                 Recall            I                 H                 <k>
GRM     0.8101(0.0028)   0.0160(0.0002)    0.0766(0.0003)    0.3580(0.0021)    0.1627(0.0004)    520(1.3402)
CF      0.8714(0.0021)   0.0235(0.0003)    0.1103(0.0004)    0.3106(0.0009)    0.6787(0.0010)    423(1.2803)
NBI     0.8858(0.0019)   0.0251(0.0003)    0.1182(0.0004)    0.2819(0.0008)    0.7299(0.0006)    398(1.0763)
HNBI    0.8877(0.0020)   0.0270(0.0004)    0.1265(0.0005)    0.2405(0.0006)    0.8790(0.0003)    312(0.6855)
CBI     0.9056(0.0014)   0.0268(0.0004)    0.1260(0.0005)    0.2142(0.0005)    0.8314(0.0003)    316(0.9044)
UCBI    0.9173(0.0012)   0.0390(0.0003)    0.1806(0.0001)    0.1683(0.0003)    0.9346(0.0003)    215(0.1430)

Amazon
        AUC              P                 Recall            I                 H                 <k>
GRM     0.6409(0.0029)   0.0036(0.00008)   0.0727(0.00009)   0.0709(0.0006)    0.0584(0.0001)    133(0.3)
CF      0.8810(0.0017)   0.0156(0.0001)    0.2971(0.0001)    0.0927(0.0001)    0.8649(0.0008)    81(0.1938)
NBI     0.8844(0.0018)   0.0161(0.0001)    0.3050(0.0001)    0.0899(0.0001)    0.8619(0.0006)    81(0.1775)
HNBI    0.8844(0.0018)   0.0163(0.0001)    0.3079(0.0001)    0.0896(0.0001)    0.8652(0.0006)    81(0.1689)
CBI     0.8937(0.0018)   0.0186(0.0002)    0.3499(0.0002)    0.0881(0.0002)    0.9413(0.0002)    59(0.1088)
UCBI    0.8944(0.0005)   0.0189(0.0001)    0.3548(0.0001)    0.0861(0.0002)    0.9650(0.0002)    48(0.1800)

RYM
        AUC              P                 Recall            I                 H                 <k>
GRM     0.8786(0.0001)   0.0034(0.00001)   0.1153(0.00002)   0.1334(0.0003)    0.0701(0.00007)   1343(0.4268)
CF      0.9548(0.0001)   0.0129(0.00003)   0.4185(0.00003)   0.1604(0.00006)   0.8216(0.00001)   1114(0.5895)
NBI     0.9611(0.0001)   0.0131(0.00006)   0.4251(0.00005)   0.1580(0.0001)    0.7912(0.00008)   1195(0.7061)
HNBI    0.9644(0.0001)   0.0135(0.00005)   0.4388(0.00005)   0.1548(0.00008)   0.8113(0.00001)   1154(0.5654)
CBI     0.9692(0.0001)   0.0143(0.00004)   0.4647(0.00003)   0.1362(0.00005)   0.8302(0.00002)   1075(0.5654)
UCBI    0.9704(0.0002)   0.0152(0.00001)   0.4937(0.00002)   0.1207(0.00001)   0.8739(0.00003)   919(0.2900)
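
The unbalanced variant can be sketched by raising the forward and the normalized backward weights to separate exponents. The sketch assumes that the UCBI weight is $(w_{ij})^{\alpha}$ times the normalized backward term raised to $\beta$, consistent with the statement in the caption of figure 4 that $(\alpha, \beta) = (1, 1)$ corresponds to CBI; the exact extracted formula is damaged, so this form is an assumption. The names `ucbi_weights` and `W_TOY` are hypothetical, and non-negative exponents are assumed.

```python
def ucbi_weights(W, alpha, beta):
    """Unbalanced consistence weights under the assumed form of Eq. (6):
    W_u[i][j] = W[i][j]**alpha * (W[j][i] / sum_j' W[j'][i])**beta.

    alpha and beta are assumed to be non-negative.
    """
    n = len(W)
    col_sum = [sum(W[jp][i] for jp in range(n)) for i in range(n)]
    W_u = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if col_sum[i] == 0:
            continue  # nothing to normalize for this object
        for j in range(n):
            forward = W[i][j] ** alpha                  # influence from o_j to o_i
            backward = (W[j][i] / col_sum[i]) ** beta   # normalized influence from o_i to o_j
            W_u[i][j] = forward * backward
    return W_u


# Hypothetical NBI weight matrix for three objects (each column sums to 1).
W_TOY = [[0.50, 0.50, 0.25],
         [0.25, 0.50, 0.00],
         [0.25, 0.00, 0.75]]
```

Under this assumed form, `ucbi_weights(W_TOY, 1, 1)` reproduces the balanced CBI weights, while `ucbi_weights(W_TOY, 1, 0)` switches the backward direction off and returns the original matrix, so the two exponents interpolate between purely causal and fully consistent weighting.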

To evaluate the algorithmic performance, we consider six well-known metrics [23]: AUC value (AUC),
precision ($P$) and recall (Recall) for accuracy, intra-similarity ($I$) and Hamming distance ($H$) for
diversity, and average degree ($\langle k \rangle$) for novelty (see details in Materials and Methods). For
$I$ and $\langle k \rangle$, the lower the better, while for the others the larger the better. We compare CBI
and UCBI with four benchmark methods (see details in Materials and Methods): global ranking method
(GRM), user-based collaborative filtering (CF), network-based inference (NBI) and heterogeneous
network-based inference (HNBI). As shown in Table 1, for all three aspects (accuracy, diversity and
novelty) and all four data sets, CBI largely outperforms the four benchmark algorithms, and UCBI further
improves on the performance of CBI. One can thus expect that both the click rate and the user experience
can be enhanced by applying CBI or UCBI. Complementary to Table 1, we plot the precision-recall
curves [54,55] by varying the length of the recommendation list $L$. A curve closer to the upper right
corresponds to higher accuracy. As shown in figure 3, for all four data sets, the curves show the same
order from the lower left to the upper right, namely the accuracy order is
GRM < CF < NBI < HNBI < CBI < UCBI, supporting the results presented in Table 1.

Figure 3. Precision-recall curves obtained by varying the length of the recommendation list $L$ from 1
to the cardinality of the testing set.

To see the sensitivity to the parameters in UCBI, figure 4 shows some representative curves obtained by
fixing $\alpha$ while varying $\beta$. Except for Amazon, the optimal accuracies obtained by UCBI are much
higher than those obtained by CBI. In addition, looking at the optimal values $(\alpha^*, \beta^*)$ (see
figure 4(e), figure 4(f) and Table 2), for all four data sets $\alpha^*$ is clearly larger than $\beta^*$,
suggesting that the influence from prior selections to posterior selections should be larger than the
influence from posterior selections to prior selections.

Table 2. Optimal values of the parameters $\alpha$ and $\beta$ for AUC and precision, respectively.

Data        $\alpha^*_{AUC}$   $\beta^*_{AUC}$   $\alpha^*_{P}$   $\beta^*_{P}$
Movielens   0.79               0.51              0.70             0.34
Netflix     0.85               0.60              0.52             0.14
Amazon      0.83               0.71              1.07             0.99
RYM         0.86               0.73              0.94             0.71


Discussions 

Recommender systems can be mathematically described in a very abstract way as a variant of the link
prediction problem in bipartite networks [saiiT]. However, the underlying decision-making processes of
different kinds of recommender systems are very different from each other. For example, the click stream
on free or cheap products shows a different pattern from that on very expensive products, and our choices
usually contain many unconscious biases, such as the anchoring bias and herd behavior caused by other
peers’ choices and critiques [551160). Therefore, the causality relationship cannot fully explain the
selecting behavior of users. In a causality-based recommender system, if the target user has already
selected the object A and we need to choose between two candidate recommendations B and B', then the
system will compare the recommendation strengths from A to B and from A to B'. In this paper, we argue
that consistent interests play a major role in determining users’ selections; hence, in addition to the above
operation, we should also compare the recommendation strengths from B to A and from B' to A. In a
word, only if the recommendation strengths from A to B and from B to A are both high can we infer that
A and B are consistent for the target user.

Figure 4. The AUC value (left column) and precision (right column) of UCBI under different
parameters $(\alpha, \beta)$ for (a) MovieLens, (b) Netflix, (c) Amazon, and (d) RYM. The optimal parameters
are denoted by blue triangles and the parameters corresponding to CBI, $(\alpha, \beta) = (1, 1)$, are marked by
black stars. The optimal parameters for AUC and precision are directly shown in (e) and (f),
respectively.

According to extensive experiments on four real data sets, we show that this simple variation can
remarkably improve the algorithm’s accuracy, diversity and novelty. Numerical investigation suggests
that the influences from A to B and from B to A are not equal: the former (aligned with the causal
direction) should be stronger, as indicated by the relationship $\alpha^* > \beta^*$ in all cases. The
introduction of this unbalance further improves the algorithm’s performance considerably in all three
aspects. The consideration of the recommendation power from unselected objects to selected objects
provides a novel viewpoint on traditional recommender systems, and we hope this finding can bring us
both better algorithms and more insights.


Materials and Methods 

Data Description 

To verify the performance of the recommendation algorithms, four benchmark datasets are used:
Movielens, Netflix, Amazon and Rate Your Music (RYM)¹. Movielens and Netflix are two famous movie
recommendation websites, Amazon is a large global online shopping store selling various kinds of
commodities, and RYM is a well-known music recommendation website. To recommend appropriate
objects, they all use ratings to capture users’ preferences, with ratings from 1 to 5 stars in Movielens,
Netflix and Amazon and from 1 to 10 in RYM. Owing to the vast amount of data and to long-tail effects,
excellent algorithms are essential to successful recommendations and can further strengthen customers’
loyalty to these websites. For the sake of simplicity and privacy protection, we first anonymize the types
of goods and the names of users, and then assign a link between a user and an object if the rating is ≥ 3
in Movielens, Netflix and Amazon, or > 5 in RYM. That is to say, only links associated with relatively
high ratings are kept, which may decrease the algorithms’ accuracy [61]. However, this issue is beyond
the scope of this paper. After processing, the primary statistics of the data are summarized in Table 3.

Table 3. Primary statistics of the four data sets.

Data        #Users   #Objects   #Links    Sparsity
Movielens   943      1682       82520     6.3 × 10^-2
Netflix     10000    6000       701947    1.17 × 10^-2
Amazon      3604     4000       134679    9.24 × 10^-3
RYM         33786    5381       613387    3.37 × 10^-3


¹ Datasets are obtained from the following web sites: http://www.grouplens.org/; http://www.netflix.com/;
http://www.amazon.com/; http://rateyourmusic.com/.
Metrics

Before the numerical experiments, the link set $E$ is randomly divided into two parts: $E^T$ is the training
set consisting of 90% of the links and $E^P$ is the testing set containing the remaining 10%. The reported
results are not sensitive to the size of the training or testing sets, unless one of them is extremely small.
Obviously, $E^T \cap E^P = \emptyset$. The links in the testing set are regarded as unknown information and
must not be used in the training process. In the following, we introduce six performance indices for the
algorithms’ accuracy, diversity and novelty.


(1) Area Under the ROC Curve (AUC) [62]. AUC attempts to measure how well a recommender system
can distinguish the relevant objects (those appreciated by a user) from the irrelevant
objects (all the others). The simplest way to calculate AUC is to compare the probability that
the relevant objects will be recommended with that of the irrelevant objects. For $n$ independent
comparisons (each comparison refers to choosing one relevant and one irrelevant object), if there are
$n'$ times when the relevant object has a higher score than the irrelevant one and $n''$ times when the
scores are equal, then

$$AUC = \frac{n' + 0.5 n''}{n}. \qquad (8)$$

Clearly, if all relevant objects have higher scores than irrelevant objects, AUC = 1, which means a
perfect recommendation list. For a randomly ranked recommendation list, AUC = 0.5. Therefore,
the degree to which AUC exceeds 0.5 indicates the ability of a recommendation algorithm to identify
relevant objects. Notice that the sole usage of AUC may result in some misleading conclusions [63],
therefore we also consider the $L$-dependent accuracy metrics, precision and recall, and also show the
precision-recall curves.
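
The sampling procedure behind Eq. (8) can be sketched as follows. This is a minimal illustration rather than the evaluation code used for Table 1: the function name `sampled_auc`, the dictionary-based score container and the fixed random seed are assumptions made for the example.

```python
import random


def sampled_auc(scores, relevant, irrelevant, n_samples, seed=0):
    """Estimate AUC = (n' + 0.5 * n'') / n by random pairwise comparisons (Eq. (8)).

    scores:     dict mapping each object to its recommendation score
    relevant:   list of objects in the user's testing set
    irrelevant: list of objects the user has not collected at all
    """
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    higher = ties = 0
    for _ in range(n_samples):
        r = rng.choice(relevant)
        s = rng.choice(irrelevant)
        if scores[r] > scores[s]:
            higher += 1        # counts towards n'
        elif scores[r] == scores[s]:
            ties += 1          # counts towards n''
    return (higher + 0.5 * ties) / n_samples
```

When every relevant object scores above every irrelevant one the estimate is exactly 1, and when all scores are tied it is exactly 0.5, matching the two limiting cases described above.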


(2) Precision ($P$) [25]. The number of objects recommended to a user is often limited, and even given
a long recommendation list, users usually consider only the top part of it. For an arbitrary target
user $u_i$, the precision of $u_i$, $P_i(L)$, is defined as the ratio of the number of $u_i$’s removed links
(corresponding to relevant selections) contained in the top-$L$ recommendations, $R_i(L)$, to $L$:

$$P_i(L) = \frac{R_i(L)}{L}. \qquad (9)$$

The precision $P(L)$ of the whole system is the average of the individual precisions over all users:

$$P(L) = \frac{1}{m} \sum_{i=1}^{m} P_i(L). \qquad (10)$$

A higher precision indicates higher accuracy.


(3) Recall [25]. Recall considers the ratio of relevant selections that can be recovered in the top-$L$
recommendation list. There are two alternative ways to define recall. We can first define the recall
of an individual user $u_i$ as

$$Recall_i(L) = \frac{R_i(L)}{|E_i^P|}, \qquad (11)$$

where $E_i^P$ denotes the set of links associated with user $u_i$ in the testing set $E^P$, and thus $|E_i^P|$
is the number of $u_i$’s selections in the testing set. Then, similar to precision, the recall value for the
whole system is defined as the average value over all users:

$$Recall(L) = \frac{1}{m} \sum_{i=1}^{m} Recall_i(L). \qquad (12)$$

We can also directly define the recall value as the ratio of relevant objects recovered by all the $m$
recommendation lists:

$$Recall(L) = \frac{1}{|E^P|} \sum_{i=1}^{m} R_i(L). \qquad (13)$$

In this paper, we adopt the latter metric. In addition to the separate comparisons of precision and
recall, one usually plots the precision-recall curves by varying $L$ to see the overall performance [54,55],
and a curve closer to the upper right indicates higher accuracy.
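
A short sketch of how the system-level precision of Eq. (10) and the pooled recall of Eq. (13) could be computed from recommendation lists. The function name `precision_recall` and the dictionary layout (user to ranked list, user to set of test objects) are assumptions made for this example.

```python
def precision_recall(rec_lists, test_sets, L):
    """Return (P(L), Recall(L)) following Eqs. (9)-(10) and Eq. (13).

    rec_lists: dict user -> ranked list of recommended objects
    test_sets: dict user -> set of objects linked to that user in the testing set
    """
    users = list(rec_lists)
    # R_i(L): number of testing-set objects recovered in the top-L list of user i.
    hits = {u: len(set(rec_lists[u][:L]) & test_sets.get(u, set())) for u in users}
    precision = sum(hits[u] / L for u in users) / len(users)
    total_test_links = sum(len(test_sets.get(u, ())) for u in users)   # |E^P|
    recall = sum(hits.values()) / total_test_links if total_test_links else 0.0
    return precision, recall
```

Sweeping `L` from 1 upwards and collecting the returned pairs gives the points of a precision-recall curve.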


(4) Hamming distance ($H$) [45]. The algorithm should guarantee the diversity of recommendations,
viz., different users should be recommended different objects. This inter-user diversity can be quantified
via the Hamming distance. If the number of objects shared by the recommendation lists of $u_i$ and $u_j$
is $Q$, their Hamming distance is defined as

$$H_{ij} = 1 - Q/L. \qquad (14)$$

Generally speaking, a more personalized recommendation list should have larger Hamming distances
to other lists. Accordingly, we use the mean value of the Hamming distance,

$$H = \frac{1}{m(m-1)} \sum_{i \neq j} H_{ij}, \qquad (15)$$

averaged over all user-user pairs, to measure the diversity of recommendations. Note that $H$ only
takes into account the diversity among users.
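
The mean Hamming distance of Eq. (15) can be sketched as below. The function name `mean_hamming` is hypothetical; since $H_{ij}$ is symmetric, averaging over unordered pairs gives the same value as the sum over ordered pairs divided by $m(m-1)$.

```python
from itertools import combinations


def mean_hamming(rec_lists, L):
    """Average of H_ij = 1 - Q/L over all user pairs (Eqs. (14)-(15)).

    rec_lists: dict user -> ranked list of recommended objects
    """
    pairs = list(combinations(rec_lists, 2))   # unordered user pairs
    if not pairs:
        return 0.0
    total = 0.0
    for u, v in pairs:
        overlap = len(set(rec_lists[u][:L]) & set(rec_lists[v][:L]))   # Q
        total += 1.0 - overlap / L
    return total / len(pairs)
```

Identical lists for all users give 0, and completely disjoint lists give 1.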


(5) Intra-similarity ($I$) [57]. A good algorithm should also make the recommendations to a single user
diverse to some extent [64]; otherwise users may grow tired of receiving many recommended objects
on the same topic. For an arbitrary target user $u_l$, denote the recommended objects
as $\{o_1, o_2, \ldots, o_L\}$. Using the Sørensen index |^, the similarity between two objects $o_i$ and $o_j$
can be written as

$$s^o_{ij} = \frac{1}{\sqrt{k(o_i) k(o_j)}} \sum_{l=1}^{m} a_{il} a_{jl}. \qquad (16)$$

The intra-similarity of $u_l$’s recommendation list can be defined as

$$I_l = \frac{1}{L(L-1)} \sum_{i \neq j} s^o_{ij}, \qquad (17)$$

and the intra-similarity of the whole system is thus defined as

$$I = \frac{1}{m} \sum_{l=1}^{m} I_l. \qquad (18)$$


(6) Average degree ($\langle k \rangle$). Given that $o_{ij}$ is the $j$th recommended object for user $u_i$,
$k(o_{ij})$ represents the degree of object $o_{ij}$, so the popularity is quantified by the average degree of
all recommended items:

$$\langle k \rangle = \frac{1}{mL} \sum_{i=1}^{m} \sum_{j=1}^{L} k(o_{ij}). \qquad (19)$$

A smaller $\langle k \rangle$ is preferred, since recommending niche objects usually brings a better user
experience.
Benchmark Methods

Global Ranking Method (GRM) [55]. GRM sorts all the objects in descending order of degree
after removing the objects that have been collected by the target user, and recommends the $L$ objects
with the highest degrees.
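
GRM is simple enough to state in a few lines. The sketch below uses the same hypothetical dense layout as before (objects as rows, users as columns); the names `grm_recommend` and `A_TOY` are assumptions for illustration.

```python
def grm_recommend(A, user, L):
    """Recommend the L most popular objects that `user` has not collected.

    A[i][l] == 1 if object i is collected by user l.
    """
    degrees = [sum(row) for row in A]                            # object degrees k(o_i)
    candidates = [i for i in range(len(A)) if A[i][user] == 0]   # drop collected objects
    candidates.sort(key=lambda i: degrees[i], reverse=True)
    return candidates[:L]


# Hypothetical toy data: 3 objects (rows) x 3 users (columns).
A_TOY = [[1, 1, 0],
         [1, 0, 0],
         [0, 1, 1]]
```

Because the ranking depends only on object degrees, every user receives nearly the same list, which is consistent with the low Hamming distances reported for GRM in Table 1.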

Collaborative Filtering (CF) [25]. CF is based on measuring the similarity between users or objects.
Here we consider the user-based CF, and for any two users $u_i$ and $u_j$, their similarity is defined by the
Sørensen index (for more local similarity indices and a comparison of them, see Refs. [55,57]):

$$s_{ij} = \frac{1}{\sqrt{k(u_i) k(u_j)}} \sum_{l=1}^{n} a_{li} a_{lj}. \qquad (20)$$

For any user-object pair $u_i$-$o_j$, if $u_i$ has not yet collected $o_j$ (i.e., $a_{ji} = 0$), the predicted
score $v_{ij}$ (to what extent $u_i$ likes $o_j$) is given as

$$v_{ij} = \frac{\sum_{l=1, l \neq i}^{m} s_{li} a_{jl}}{\sum_{l=1, l \neq i}^{m} s_{li}}. \qquad (21)$$

For any user $u_i$, all the nonzero $v_{ij}$ with $a_{ji} = 0$ are sorted in descending order, and the objects in
the top-$L$ are recommended.
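
A minimal sketch of the user-based CF benchmark, combining the similarity of Eq. (20) with the predicted score of Eq. (21). The names `cf_recommend` and `A_TOY` are hypothetical, and the dense list-of-lists layout (objects as rows, users as columns) is an assumption of the example.

```python
from math import sqrt


def cf_recommend(A, user, L):
    """User-based CF: rank uncollected objects by the score v of Eq. (21).

    A[j][l] == 1 if object j is collected by user l.
    """
    n, m = len(A), len(A[0])
    k_usr = [sum(A[j][l] for j in range(n)) for l in range(m)]   # user degrees

    def similarity(a, b):
        # Eq. (20): number of common objects divided by sqrt(k(u_a) * k(u_b)).
        if k_usr[a] == 0 or k_usr[b] == 0:
            return 0.0
        common = sum(A[j][a] * A[j][b] for j in range(n))
        return common / sqrt(k_usr[a] * k_usr[b])

    sims = {l: similarity(l, user) for l in range(m) if l != user}
    norm = sum(sims.values())
    if norm == 0:
        return []  # no similar users, so nothing can be scored
    scores = {}
    for j in range(n):
        if A[j][user] == 0:
            v = sum(s * A[j][l] for l, s in sims.items()) / norm
            if v > 0:
                scores[j] = v  # keep only nonzero scores, as in the text
    return sorted(scores, key=scores.get, reverse=True)[:L]


# Hypothetical toy data: 3 objects (rows) x 3 users (columns).
A_TOY = [[1, 1, 0],
         [1, 0, 0],
         [0, 1, 1]]
```

In the toy data the third user shares an object only with the second user, so the only object recommended to the third user is the one the second user has collected and the third has not.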

Heterogeneous NBI (HNBI) [45]. HNBI is a heterogeneous network-based inference algorithm that
depends on the degrees of the initial resource nodes:

$$w^{H}_{ij} = [k(o_j)]^{\beta} w_{ij}, \qquad (22)$$

where $w_{ij}$ is defined according to Eq. (1) and the other algorithmic procedures are similar to NBI.


Acknowledgments 

This work was partially supported by the National Natural Science Foundation of China (Nos. 61302077,
61433014 and 11222543), National Major Science and Technology Special Project of China (No. 2014AA01A706),
Funds for Creative Research Groups of China (No. 61121001) and BUPT Excellent Ph.D. Students
Foundation (No. CX201433). TZ acknowledges the Program for New Century Excellent Talents in University
under Grant No. NCET-11-0070 and the Special Project of Sichuan Youth Science and Technology
Innovation Research Team (Grant No. 2013TD0006).


References 

1. Zhang G. Q., Zhang G. Q., Yang Q. F., Cheng S. Q. & Zhou T. Evolution of the internet and its 
cores. New J. Phys. 10, 123027 (2008). 

2. Pastor-Satorras R. & Vespignani A. Evolution and structure of the Internet: A statistical physics 
approach (Cambridge University Press, Cambridge, 2007). 

3. Broder A., Kumar R., Maghoul F., Raghavan P., Rajagopalan S., et al. Graph structure in the 
web. Comput. Netw. 33, 309 (2000). 

4. Doan A., Ramakrishnan R. & Halevy A. Y. Crowdsourcing systems on the world-wide web. Comm. 
ACM 54, 86 (2011). 

5. Goggin G. Cell phone culture: mobile technology in everyday life (Routledge, New York, 2012). 

6. Zheng P. & Ni L. Smart phone and next generation mobile computing (Morgan Kaufmann, 2010).







7. Lawrence S. & Giles C. L. Accessibility of information on the web. Nature 400, 107 (1999). 

8. Jansen B. J. & Pooch U. A review of web searching studies and a framework for future research.
J. Am. Soc. Inf. Sci. Technol. 52, 235 (2001).

9. Croft W. B., Metzler D. & Strohman T. Search engines: Information retrieval in practice (Reading:
Addison-Wesley, 2010).

10. Resnick P. & Varian H. R. Recommender systems. Comm. ACM 40, 56 (1997).

11. Linden G., Smith B. & York J. Amazon.com recommendations: Item-to-item collaborative filtering.
IEEE Internet Comput. 7, 76 (2003).

12. Ellison N. B., Steinfield C. & Lampe C. The benefits of Facebook friends: social capital and college
students’ use of online social network sites. J. Comput-Med. Commun. 12, 1143 (2007).

13. Qian X., Feng H., Zhao G. & Mei T. Personalized recommendation combining user interest and
social circle. IEEE Trans. Knowl. Data Eng. 26, 1763 (2013).

14. Moerchen F., Mierswa I. & Ultsch A. Understandable models of music collections based on
exhaustive feature generation with temporal statistics. In Proceedings of the 12th ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining KDD’06, New York, NY,
USA, 882-891 (2006).

15. Anderson C. The long tail: Why the future of business is selling less of more (Hachette Digital, 
Inc., 2008). 

16. Brynjolfsson E., Hu Y. J. & Smith M. D. Consumer surplus in the digital economy: Estimating 
the value of increased product variety at online booksellers. Manage. Sci. 49, 1580 (2003). 

17. Schafer, J. B., Konstan, J., & Riedl, J. Recommender systems in e-commerce. In Proceedings of 
the 1st ACM conference on Electronic commerce EC’99, New York, NY, USA, 158-166 (1999). 

18. Chen, P. Y., Wu, S. Y., & Yoon, J. The impact of online recommendations and consumer feedback 
on sales. ICIS 2004 Proceedings, 58 (2004). 

19. Schafer J. B., Konstan J. A. & Riedl J. E-commerce recommendation applications. In Applications 
of Data Mining to Electronic Commerce, Springer, US, 115-153 (2001). 

20. Huang Z., Zeng D. & Chen H. A comparison of collaborative-filtering recommendation algorithms 
for e-commerce. IEEE Intel. Syst. 22, 68-78 (2007). 

21. Wei K., Huang J. & Fu S. A survey of e-commerce recommender systems. In Service Systems and 
Service Management Conference on, IEEE, 1-5 (2007). 

22. Adomavicius G. & Tuzhilin A. Toward the next generation of recommender systems: A survey of
the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 17, 734-749 (2005).

23. Lü L., Medo M., Yeung C. H., Zhang Y. C., Zhang Z. K., et al. Recommender systems.
Phys. Rep. 519, 1-49 (2012).

24. Shapira B. Recommender systems handbook (Springer, US, 2011).

25. Herlocker J. L., Konstan J. A., Terveen L. G. & Riedl J. T. Evaluating collaborative filtering
recommender systems. ACM Trans. on Inf. Syst. (TOIS) 22, 5-53 (2004).




26. Ansari A., Essegaier S. & Kohli R. Internet recommendation systems. J. Market. Res. 37, 363-375 

( 2000 ). 

27. Pazzani M. J. & Billsus D. Content-based recommendation systems. In The adaptive web, Springer 
Berlin Heidelberg, 325-341 (2007). 

28. Adomavicius G., Sankaranarayanan R., Sen S. & Tuzhilin A. Incorporating contextual information 
in recommender systems using a multidimensional approach. ACM Trans. on Inf. Syst. (TOIS) 
23, 103 (2005). 

29. Trewin S. Knowledge-based recommender systems. Encycl. Libr. Inf. Sci. 69, 69 (2000). 

30. Petridou S. G., Koutsonikola V. A., Vakali A. I. & Papadimitriou G. I. Time-aware web users’ 
clustering. IEEE Trans. Knowl. Data Eng. 20, 653 (2008). 

31. Campos P. G., Díez F. & Cantador I. Time-aware recommender systems: a comprehensive survey 
and analysis of existing evaluation protocols. User Modeling and User-Adapted Interaction 24, 
67-119 (2014). 

32. Zhang Z. K., Zhou T. & Zhang Y. C. Tag-aware recommender systems: a state-of-the-art survey. 
J. Comput. Sci. Technol. 26, 767-777 (2011). 

33. Tso-Sutter K. H., Marinho L. B. & Schmidt-Thieme L. Tag-aware recommender systems by fusion 
of collaborative filtering algorithms. In Proceedings of the 2008 ACM symposium on Applied 
computing, New York, NY, USA, 1995-1999 (2008). 

34. Ma H., Yang H., Lyu M. R. & King I. Sorec: social recommendation using probabilistic matrix 
factorization. In Proceedings of the 17th ACM conference on Information and knowledge management, 
New York, NY, USA, 931-940 (2008). 

35. Shepitsen A., Gemmell J., Mobasher B. & Burke R. Personalized recommendation in social tagging 
systems using hierarchical clustering. In Proceedings of the 2008 ACM conference on Recommender 
systems, New York, NY, USA, 259-266 (2008). 

36. Felfernig A. & Burke R. Constraint-based recommender systems: technologies and research issues. 
In Proceedings of the 10th international conference on Electronic commerce EC’08, New York, 
NY, USA, 3 (2008). 

37. Maslov S. & Zhang Y. C. Extracting hidden information from knowledge networks. Phys. Rev. 
Lett. 87, 248701 (2001). 

38. Ren J., Zhou T. & Zhang Y. C. Information filtering via self-consistent refinement. EPL 82, 
58007 (2008). 

39. Goldberg K., Roeder T., Gupta D. & Perkins G. Eigentaste: A constant time collaborative filtering 
algorithm. Inf. Retrieval 4, 133-151 (2001). 

40. Burke R. Hybrid recommender systems: Survey and experiments. User modeling and user-adapted 
interaction 12, 331-370 (2002). 

41. Zhou, T., Kuscsik, Z., Liu, J. G., Medo, M., Wakeling, J. R. & Zhang, Y. C. Solving the apparent 
diversity-accuracy dilemma of recommender systems. Proc. Natl. Acad. Sci. USA 107, 4511 (2010). 

42. Zhang, Y. C., Medo, M., Ren, J., Zhou, T., Li, T. & Yang, F. Recommendation model based on 
opinion diffusion. EPL 80, 68003 (2007). 


43. Zhou, T., Ren, J., Medo, M. & Zhang, Y. C. Bipartite network projection and personal 
recommendation. Phys. Rev. E 76, 046115 (2007). 

44. Zhang Y. C., Blattner M. & Yu Y. K. Heat conduction process on community networks as a 
recommendation model. Phys. Rev. Lett. 99, 154301 (2007). 

45. Zhou T., Jiang L. L., Su R. Q. & Zhang Y. C. Effect of initial configuration on network-based 
recommendation. EPL 81, 58004 (2008). 

46. Jia C. X., Liu R. R., Sun D. & Wang B. H. A new weighting method in network-based 
recommendation. Physica A 387, 5887 (2008). 

47. Zhou, T., Su, R. Q., Liu, R. R., Jiang, L. L., Wang, B. H. & Zhang, Y. C. Accurate and diverse 
recommendations via eliminating redundant correlations. New J. Phys. 11, 123008 (2009). 

48. Liu R. R., Liu J. G., Jia C. X. & Wang B. H. Personal recommendation via unequal resource 
allocation on bipartite networks. Physica A 389, 3282 (2010). 

49. Liu J. G., Zhou T., Wang B. H., Zhang Y. C. & Guo Q. Degree correlation of bipartite network 
on personalized recommendation. Int. J. Mod. Phys. C 21, 137 (2010). 

50. Lü L. & Liu W. Information filtering via preferential diffusion. Phys. Rev. E 83, 066119 (2011). 

51. Liu J. G., Zhou T. & Guo Q. Information filtering via biased heat conduction. Phys. Rev. E 84, 
037101 (2011). 

52. Liu J. G., Zhou T., Wang B. H., Zhang Y. C. & Guo Q. Effects of user’s tastes on personalized 
recommendation. Int. J. Mod. Phys. C 20, 1925 (2009). 

53. Liu J. & Deng G. Link prediction in a user-object network based on time-weighted resource 
allocation. Physica A 388, 3643 (2009). 

54. Buckland M. K. & Gey F. C. The relationship between recall and precision. Journal of the 
Association for Information Science 45, 12-19 (1994). 

55. Davis J. & Goadrich M. The relationship between Precision-Recall and ROC curves. In Proceedings 
of the 23rd international conference on Machine Learning, 233-240, ACM Press, New York, USA 
(2006). 

56. Lü L. & Zhou T. Link prediction in complex networks. Physica A 390, 1150-1170 (2011). 

57. Shang M. S., Lü L., Zhang Y. C. & Zhou T. Empirical analysis of web-based user-object bipartite 
networks. EPL 90, 48006 (2010). 

58. Chen L. & Pu P. Critiquing-based recommenders: survey and emerging trends. User Modeling and 
User-Adapted Interaction 22, 125-150 (2012). 

59. Yang Z., Zhang Z. K. & Zhou T. Anchoring bias in online voting. EPL 100, 68002 (2012). 

60. Huang J., Cheng X. Q., Shen H. W., Zhou T. & Jin X. Exploring social influence via posterior 
effect of word-of-mouth recommendations. In Proceedings of the fifth ACM international conference 
on Web Search and Data Mining, 573-582, ACM Press, New York, USA (2012). 

61. Shang M. S., Lü L., Zeng W., Zhang Y. C. & Zhou T. Relevance is more significant than correlation: 
Information filtering on sparse data. EPL 88, 68008 (2009). 


62. Hanley J. A. & McNeil B. J. The meaning and use of the area under a receiver operating 
characteristic (ROC) curve. Radiology 143, 29-36 (1982). 

63. Lobo J. M., Jiménez-Valverde A. & Real R. AUC: a misleading measure of the performance of 
predictive distribution models. Global Ecology and Biogeography 17, 145-151 (2008). 

64. Ziegler C. N., McNee S. M., Konstan J. A. & Lausen G. Improving recommendation lists through 
topic diversification. In Proceedings of the 14th international conference on World Wide Web, 
22-32, New York, NY, USA, (2005). 

65. Sørensen T. A method of establishing groups of equal amplitude in plant sociology based on 
similarity of species and its application to analyses of the vegetation on Danish commons. Biol. 
Skr. 5, 1 (1948). 

66. Liben-Nowell D. & Kleinberg J. The link-prediction problem for social networks. J. Am. Soc. Inf. 
Sci. 58, 1019-1031 (2007). 

67. Zhou T., Lü L. & Zhang Y. C. Predicting missing links via local information. Eur. Phys. J. B 71, 
623-630 (2009). 

Appendix 


Table 4. Analogous to Table 1, but with L = 10. 

Movielens  AUC              P                Recall           I                H                ⟨k⟩
GRM        0.8570(0.0024)   0.0818(0.0021)   0.0987(0.0025)   0.4585(0.0021)   0.5113(0.0002)   365(0.9628)
CF         0.8990(0.0020)   0.1232(0.0011)   0.1487(0.0028)   0.4528(0.0018)   0.7199(0.0037)   326(1.4314)
NBI        0.9092(0.0016)   0.1283(0.0022)   0.1549(0.0027)   0.4348(0.0018)   0.7397(0.0032)   317(1.2401)
HNBI       0.9146(0.0014)   0.1350(0.0016)   0.1630(0.0020)   0.4244(0.0014)   0.7944(0.0017)   299(1.0315)
CBI        0.9249(0.0011)   0.1350(0.0023)   0.1630(0.0028)   0.4278(0.0014)   0.7739(0.0027)   305(1.1378)
UCBI       0.9339(0.0011)   0.1669(0.0025)   0.2015(0.0031)   0.4073(0.0018)   0.8754(0.0009)   259(1.0924)

Netflix    AUC              P                Recall           I                H                ⟨k⟩
GRM        0.8102(0.0028)   0.0246(0.0006)   0.0204(0.0005)   0.3941(0.0023)   0.2256(0.0010)   725(3.4177)
CF         0.8714(0.0021)   0.0386(0.0008)   0.0320(0.0006)   0.3277(0.0011)   0.7683(0.0017)   529(2.3665)
NBI        0.8858(0.0019)   0.0414(0.0009)   0.0344(0.0007)   0.2915(0.0008)   0.8119(0.0012)   501(2.0976)
HNBI       0.8877(0.0020)   0.0438(0.0007)   0.0363(0.0006)   0.2401(0.0007)   0.9404(0.0008)   357(2.0236)
CBI        0.9056(0.0014)   0.0431(0.0009)   0.0358(0.0007)   0.2065(0.0006)   0.8969(0.0007)   367(1.7398)
UCBI       0.9173(0.0012)   0.0695(0.0009)   0.0577(0.0008)   0.1690(0.0004)   0.9657(0.0002)   244(0.9395)

Amazon     AUC              P                Recall           I                H                ⟨k⟩
GRM        0.6409(0.0029)   0.0047(0.0003)   0.0162(0.0010)   0.0911(0.0022)   0.0832(0.0004)   173(0.8814)
CF         0.8810(0.0017)   0.0320(0.0007)   0.1107(0.0026)   0.1468(0.0007)   0.9254(0.0007)   104(0.3520)
NBI        0.8844(0.0018)   0.0325(0.0027)   0.1125(0.0027)   0.1427(0.0007)   0.9193(0.0010)   105(0.3743)
HNBI       0.8844(0.0018)   0.0328(0.0007)   0.1135(0.0026)   0.1423(0.0007)   0.9214(0.0010)   104(0.3990)
CBI        0.8937(0.0018)   0.0398(0.0010)   0.1375(0.0036)   0.1445(0.0007)   0.9678(0.0003)   74(0.2561)
UCBI       0.8967(0.0057)   0.0421(0.0031)   0.1454(0.0107)   0.1410(0.0026)   0.9789(0.0135)   69(0.7100)

RYM        AUC              P                Recall           I                H                ⟨k⟩
GRM        0.8786(0.0004)   0.0096(0.00008)  0.0585(0.0005)   0.3274(0.0044)   0.2142(0.0003)   4011(2.1176)
CF         0.9547(0.0004)   0.0299(0.00007)  0.1811(0.0004)   0.2472(0.0003)   0.7709(0.0002)   2395(1.7790)
NBI        0.9611(0.0001)   0.0283(0.0003)   0.1715(0.0007)   0.2646(0.0003)   0.6188(0.0001)   3001(0.8515)
HNBI       0.9644(0.0001)   0.0290(0.00009)  0.1760(0.0005)   0.2589(0.0003)   0.6480(0.00001)  2896(0.4095)
CBI        0.9692(0.0002)   0.0306(0.0001)   0.1854(0.0010)   0.2294(0.0004)   0.7394(0.00006)  2506(1.1694)
UCBI       0.9705(0.0002)   0.0372(0.00006)  0.2253(0.0003)   0.2031(0.0002)   0.8433(0.0004)   1977(2.6425)

Table 5. Analogous to Table 1, but with L = 100. 


Movielens  AUC              P                Recall           I                H                ⟨k⟩
GRM        0.8570(0.0023)   0.0372(0.0005)   0.4497(0.0070)   0.3601(0.0011)   0.3377(0.0007)   215(0.3110)
CF         0.8991(0.0020)   0.0442(0.0005)   0.5345(0.0070)   0.3336(0.0006)   0.4825(0.0012)   205(0.3753)
NBI        0.9093(0.0016)   0.0461(0.0006)   0.5569(0.0074)   0.3153(0.0006)   0.5208(0.0011)   199(0.3772)
HNBI       0.9146(0.0014)   0.0477(0.0006)   0.5770(0.0079)   0.3003(0.0006)   0.5946(0.0011)   188(0.3377)
CBI        0.9249(0.0011)   0.0486(0.0001)   0.5877(0.0076)   0.2868(0.0006)   0.6248(0.0008)   180(0.3157)
UCBI       0.9339(0.0011)   0.0540(0.0004)   0.6525(0.0051)   0.2543(0.0006)   0.7795(0.0007)   142(0.3084)

Netflix    AUC              P                Recall           I                H                ⟨k⟩
GRM        0.8101(0.0028)   0.0136(0.0002)   0.1130(0.0017)   0.3395(0.0018)   0.1416(0.0003)   449(1.3402)
CF         0.8714(0.0021)   0.0185(0.0002)   0.1542(0.0018)   0.3034(0.0007)   0.6167(0.0009)   378(0.9545)
NBI        0.8858(0.0019)   0.0197(0.0002)   0.1637(0.0018)   0.2771(0.0006)   0.6727(0.0007)   358(0.8371)
HNBI       0.8877(0.0020)   0.0211(0.0004)   0.1758(0.0005)   0.2428(0.0006)   0.8318(0.0003)   292(0.5780)
CBI        0.9056(0.0014)   0.0210(0.0002)   0.1743(0.0005)   0.2167(0.0005)   0.7861(0.0004)   293(0.6797)
UCBI       0.9173(0.0014)   0.0290(0.0002)   0.2416(0.0014)   0.1698(0.0001)   0.9117(0.0002)   203(0.3938)

Amazon     AUC              P                Recall           I                H                ⟨k⟩
GRM        0.6409(0.0028)   0.0030(0.00006)  0.1045(0.0021)   0.0601(0.0006)   0.0480(0.0001)   112(0.1743)
CF         0.8810(0.0017)   0.0109(0.0001)   0.3783(0.0001)   0.0729(0.0001)   0.8309(0.0006)   71(0.1036)
NBI        0.8844(0.0018)   0.0112(0.0001)   0.3888(0.0034)   0.0705(0.0001)   0.8287(0.0006)   71(0.1162)
HNBI       0.8927(0.0012)   0.0126(0.00006)  0.4356(0.0021)   0.0644(0.0001)   0.7771(0.0007)   112(0.1689)
CBI        0.8937(0.0018)   0.0126(0.00008)  0.4362(0.0030)   0.0687(0.0001)   0.9217(0.0002)   52(0.1088)
UCBI       0.8987(0.0006)   0.0131(0.00007)  0.4529(0.0002)   0.0669(0.0003)   0.9415(0.0001)   47(0.1244)

RYM        AUC              P                Recall           I                H                ⟨k⟩
GRM        0.8786(0.0004)   0.0027(0.00002)  0.1653(0.0015)   0.1199(0.0004)   0.0520(0.00007)  1003(0.8676)
CF         0.9547(0.0002)   0.0080(0.00002)  0.4900(0.0011)   0.1400(0.00009)  0.7870(0.00008)  841(0.2231)
NBI        0.9611(0.0001)   0.0082(0.00002)  0.4992(0.0013)   0.1369(0.0001)   0.7602(0.00008)  880(0.3522)
HNBI       0.9644(0.0001)   0.0086(0.00001)  0.5230(0.0008)   0.1352(0.0001)   0.7834(0.00003)  858(0.3592)
CBI        0.9692(0.0001)   0.0091(0.00005)  0.5519(0.00003)  0.1186(0.0001)   0.8195(0.00008)  781(0.5006)
UCBI       0.9705(0.0002)   0.0096(0.00001)  0.5838(0.0006)   0.1045(0.0001)   0.8681(0.00007)  672(0.3495)













